Query Processing Techniques for Partly Inaccessible Distributed Databases (Abstract)
Abstract
One important characteristic of distributed database management systems (DBMS) is that, due to network or machine failure, the environment may become partitioned into sub-environments that cannot communicate with each other. However, there are various application areas where the individual sub-environments should remain operable even in such situations [2, 5]. In particular, queries to the database should be processed in an appropriate way [4]. To this end, the final and all intermediate results of queries in a distributed DBMS must be regarded as potentially vague or incomplete. As a consequence, we have to deal with vague values and vague collections during query processing. Whereas vague values have been addressed in various papers dealing with null values in the relational context [1, 3], vague collections stemming from inaccessibility represent a novel research topic. In this work we present suitable representations for vague sets, vague multisets and vague lists, as well as appropriate adaptations of the usual query language operators for these representations. Our representation for vague sets consists of an enumerating part and a descriptive part. The enumerating part in turn consists of (1) the explicit part of a lower bound of the desired set and (2) the explicit part of an upper bound of the desired set. Elements in (1) are known to be contained in the desired query result, while elements in (2) may be in the query result. The descriptive part is a three-valued logical predicate specifying the missing elements of the vague set. These elements complete the enumerating parts to actual bounds. The descriptive part can be employed to check whether potential candidates belong to the desired query result. This is useful because, in the case of inaccessibility, a query result can lack not only inaccessible elements but also accessible elements that could not be reached by the normal query evaluation.
An example of such a situation arises when we want to traverse a path built by a multi-step relationship that is disconnected due to inaccessibility. The adaptations of the query operators to vague sets define the explicit parts of the vague result set as well as its descriptive part. In this way we implicitly define how the descriptive part of a query result can be calculated. Furthermore, the adapted operators employ the descriptive parts of the operands to revise their enumerating parts during query processing. In this way our approach can significantly improve the correctness of the vague result set. With vague multisets, not only can elements be missing, but the number of occurrences of enumerated elements can also be vague. In our approach the first kind of vagueness is dealt with analogously to vague sets, while the second kind is treated as follows: the number of occurrences of each individual element is represented by a set of possible numbers of occurrences instead of a single number. The adapted query language operators have to combine these vague occurrences when computing a vague result multiset. Vague lists are more complicated because, when sorting a vague set or a vague multiset, even the resulting order can be vague. Our approach also comprises an adequate representation for vague lists, which minimizes the vagueness in the order of the list by enforcing the typical requirements for an order such as irreflexivity, transitivity and trichotomy. The uncertain membership of elements of vague lists is dealt with analogously to vague sets and vague multisets. Additionally, we consider vague aggregate functions. In the case of inaccessibility the evaluation of an aggregate function must also be three-valued, resulting in a set of possible aggregate values. In general this evaluation can cause considerable efficiency problems.
To overcome these problems, we identify two special classes of aggregate functions that can be computed efficiently even in the case of vagueness; these classes are called set-monotone aggregations and element-monotone aggregations. Finally, we describe how our hybrid representations for vague collections can serve as the basis for the implementation of a query language for a distributed database. Compared to other approaches (e.g. in the field of null values), the main contribution of our approach is the use of a descriptive part which describes and restricts the missing elements in a vague collection. This descriptive part can be employed in various situations to improve the enumerating parts of the query result.
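The hybrid representation of vague sets described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `VagueSet`, `sure`, `maybe`, `missing`, `member` and `intersect` are hypothetical, and the reading of the descriptive part (True means "surely a missing member", None means "possibly a missing member", False means "not a member") is an assumption made for illustration. The sketch shows an enumerating part with lower and upper bounds, a three-valued descriptive predicate, and an intersection operator that uses the descriptive parts of its operands to revise the enumerating parts of the result.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

# Three-valued truth value: True, False, or None for "unknown".
Tri = Optional[bool]


def tri_and(a: Tri, b: Tri) -> Tri:
    """Kleene conjunction over True / False / None."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


@dataclass(frozen=True)
class VagueSet:
    sure: FrozenSet[int]            # explicit part of the lower bound
    maybe: FrozenSet[int]           # explicit part of the upper bound (assumed superset of sure)
    missing: Callable[[int], Tri]   # descriptive part (assumed semantics, see lead-in)

    def member(self, x: int) -> Tri:
        """Three-valued membership test for a candidate element."""
        if x in self.sure:
            return True
        described = self.missing(x)
        if described is True:
            return True
        if x in self.maybe or described is None:
            return None
        return False


def intersect(a: VagueSet, b: VagueSet) -> VagueSet:
    """Intersection that revises the enumerating parts via the descriptive parts."""
    candidates = a.maybe | b.maybe
    verdict = {x: tri_and(a.member(x), b.member(x)) for x in candidates}
    return VagueSet(
        sure=frozenset(x for x, v in verdict.items() if v is True),
        maybe=frozenset(x for x, v in verdict.items() if v is not False),
        missing=lambda x: tri_and(a.member(x), b.member(x)),
    )


# Hypothetical scenario: elements with id >= 100 live on an inaccessible site.
a = VagueSet(
    sure=frozenset({1, 2}),
    maybe=frozenset({1, 2, 3}),
    missing=lambda x: None if x >= 100 else False,
)
# The second operand is fully known, so nothing is missing from it.
b = VagueSet(
    sure=frozenset({2, 3, 150}),
    maybe=frozenset({2, 3, 150}),
    missing=lambda x: False,
)

r = intersect(a, b)
print(sorted(r.sure))   # [2]
print(sorted(r.maybe))  # [2, 3, 150]
```

In this example element 150 is not enumerated in the first operand, but its descriptive part cannot rule it out, so the intersection keeps 150 in the upper bound of the result instead of silently dropping it.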
Similar resources
A Closed Approach to Vague Collections in Partly Inaccessible Distributed Databases
Inaccessibility of part of the database is a research topic hardly addressed in the literature about distributed databases. However, there are various application areas where the database has to remain operable even if part of the database is inaccessible. In particular, queries to the database should be processed in an appropriate way. This means that the final and all intermediate results of a...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is one of the hard research problems. Exhaustive search techniques like dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they need too much memory and processing to remain practical, so we have to use random and evolutionary methods. The use of evolutionary methods, beca...
Processing Queries over Distributed XML Databases
The increasing volume of data stored as XML documents makes fragmentation techniques an alternative to the performance issues in query processing. Fragmented databases are feasible only if there is a transparent way to query the distributed database. Fragments allow for intra-query parallel processing and data reduction. This paper presents our methodology for XQuery query processing over distr...
Review of Relational Algebra for Query Processing in Dynamic Distributed Federated Databases
This paper reviews the coverage of formal Relational Algebra as it applies to distributed, federated databases in varying network topologies. The review shows that a number of Relational Algebra extensions allow distributed relations and federation of heterogeneous database schema. More concrete physical Relational Algebra extensions support access plans for multi-database query processing but ...
A Methodology for Query Processing over Distributed XML Databases
The constant increase in the volume of data stored as native XML documents makes fragmentation techniques an important alternative to the performance issues in query processing over these data. Fragmented databases are feasible only if there is a transparent way to query the distributed database, without the need of knowing the fragmentation details and where each fragment is located. This pape...
Review of Relational Algebra for Dynamic Distributed Federated Databases
This paper reviews the coverage of formal Relational Algebra as it applies to distributed, federated databases in varying network topologies. The review shows that a number of Relational Algebra extensions allow distributed relations and federation of heterogeneous database schema. More concrete physical Relational Algebra extensions support access plans for multi-database query processing but ...